The thing that should not be: predictive coding
and the uncanny valley in perceiving human 
and humanoid robot actions 
Using functional magnetic resonance imaging (fMRI) repetition suppression, we explored the selectivity of the human action
perception system (APS), which consists of temporal, parietal and frontal areas, for the appearance and/or motion of the 
perceived agent. Participants watched body movements of a human (biological appearance and movement), a robot (mechanical 
appearance and movement) or an android (biological appearance, mechanical movement). With the exception of extrastriate 
body area, which showed more suppression for human-like appearance, the APS was not selective for appearance or motion
per se. Instead, distinctive responses were found to the mismatch between appearance and motion: whereas suppression effects
for the human and robot were similar to each other, they were stronger for the android, notably in bilateral anterior intraparietal 
sulcus, a key node in the APS. These results could reflect increased prediction error as the brain negotiates an agent that appears 
human, but does not move biologically, and help explain the 'uncanny valley' phenomenon. 

Keywords: functional magnetic resonance imaging (fMRI); repetition suppression; action perception; predictive coding; temporal 
cortex; anterior intraparietal sulcus; mirror neuron system; extrastriate body area 



INTRODUCTION 

Understanding others' movements and actions is important 
for many tasks of ecological significance, such as hunting 
prey, avoiding predators, communication and social inter- 
action. How humans and other animals achieve this has long 
been of interest in psychology and neuroscience (Blake and 
Shiffrar, 2007). In primates, perception of body movements 
is thought to be supported by a network including lateral 
superior temporal, inferior parietal and inferior frontal brain 
areas. Neuroimaging studies have shown responses in these 
areas during observation of actions; neuropsychological pa- 
tient and transcranial magnetic stimulation (TMS) studies 
have shown that damage or disruption of these areas can 
affect action processing (Saygin et al., 2004a; Pobric and
Hamilton, 2006; Grafton and Hamilton, 2007; Saygin, 2007;
Candidi et al., 2008). In non-human primates, at least two of
these regions have been reported to contain 'mirror neu- 
rons', which fire during the execution as well as the 
observation of specific movements (Rizzolatti and
Craighero, 2004). Hence, this network is sometimes referred 
to as the mirror-neuron system (MNS). The exact relation- 
ship between mirror neurons and brain areas that support 
action perception in the human brain remains a topic of 
debate (e.g. Dinstein et al., 2007; Chong et al., 2008; Kilner
et al., 2009; Mukamel et al., 2010). Accordingly, we will refer
to the brain areas most commonly discussed in relation to 
action perception (i.e. lateral temporal, inferior frontal/ven- 
tral premotor and anterior intraparietal cortex) more neu- 
trally as the Action Perception System (APS), although of 
course action perception may involve other parts of the 
brain as well. 

Observed neural activity in the APS is often interpreted 
within the framework of motor resonance, whereby 'an
action is understood when its observation causes the
motor system of the observer to "resonate"' (Rizzolatti
et al., 2001). But what are the boundary conditions for this
resonance? How similar do the actors have to be with respect 
to the observer to engage resonance? 

On the one hand, it has been argued that the closer the match
between the observed action and the observers' own
sensorimotor representations, the stronger the resonance should be.
In support of this, there are links between activity within the
APS and whether the observer can perform the seen
movement (e.g. Calvo-Merino et al., 2006; Cross et al., 2006;
Candidi et al, 2008). The appearance of the observed agent 
may also be important (Buccino et al, 2004; Chaminade
et al, 2007). On the other hand, responses in the APS can 
appear surprisingly insensitive to the surface properties of the 
viewed action stimuli. For example, in the human brain, parts 
of the APS respond to actions and body movements of simple 
animations (Pelphrey et al, 2003) or to point-light displays 
(Saygin et al, 2004b). Indeed some researchers have sug- 
gested that the system is sensitive to the action's meaning, 
but is relatively insensitive to the surface properties of the 
sensory signals transmitting this information (Craighero 
et al, 2007). 

While humans have long been preoccupied with the 
theme of creating other entities in their likeness (e.g. dolls, 
marionettes, stories like the Golem, Frankenstein), with 
technological advances, artificial agents such as humanoid 
robots and 3D animated characters are becoming more 
and more commonly encountered in daily life (Coradeschi 
et al, 2006). Artificial agents can also provide scientists with 
unique opportunities to test theories of human perception 
and cognition. For example, robots can have appearance or 
movement kinematics that are not biological, but can never- 
theless be perceived as carrying out recognizable actions. 
They can thus be used to study the functional properties 
of the APS, such as whether the network is tuned selectively 
to human-like appearance, or biological motion. 

There is a small neuroscience literature on the perception 
of actions of artificial agents, including robots. 
Unfortunately, the results are not consistent to date. Some 
studies have reported that artificial agents' actions appar- 
ently affect the observers' own motor processing, or activity 
within the APS, whereas others have argued that the APS 
either does not respond, or responds weakly if the perceived 
actor is not human (e.g. Kilner et al, 2003; Tai et al, 2004; 
Chaminade and Hodgins, 2006; Catmur et al, 2007; 
Chaminade et al, 2007; Gazzola et al, 2007; Oberman 
et al, 2007; Press et al, 2007). The specific roles of biological 
appearance vs biological motion have not been sufficiently 
explored or separated in previous studies, even though this is 
a topic of increasing interest in robotics, neuroscience and 
vision science (MacDorman and Ishiguro, 2006; Chaminade 
et al, 2007, 2010; Kanda et al, 2008; Jastorff and Orban, 
2009; Saygin et al, 2010). 

In the present study, our stimuli and experimental design 
focused on whether the seen agent had biological (human- 
like) appearance and also whether the agent's body move- 
ments were biological, plus whether their appearance and 
movements matched. We also manipulated repetition of 
successive actions, as explained below. 

While our interest was focused on the APS, it was not 
limited to these regions alone. For example, the involvement 
of form processing in biological motion perception has also 
been supported (e.g. Lange and Lappe, 2006). Our methods 
allowed us to explore regions of the brain involved in body 
movement perception without limiting our focus to the 
nodes of the APS. 



A novel aspect of this study was that we used a recently 
developed, state-of-the-art android, 1 Repliee Q2. This was 
important for several reasons. First, we did not want to 
run the risk of using a robot that was not sufficiently an- 
thropomorphic (Perani et al, 2001; Tai et al, 2004). 
Furthermore, this and similar robots have 'presence' that 
generally cannot be elicited by computer-animated artificial
agents (Sanchez-Vives and Slater, 2005). 2 Finally, by using a
state-of-the-art robot, we can engage more productively with
social robotics, a rapidly developing field (Dautenhahn, 
2007; Kahn et al, 2007). As artificial agents become part of 
our lives, appearing in a variety of domains from Hollywood 
movies and video games, through to clinical and educational 
settings (Aitkenhead and McDonald, 2006; Coradeschi et al, 
2006), research on how humans respond to such agents is 
increasingly important (Saygin et al, 2010). 

One key issue is what artificial agents should look like 
(MacDorman and Ishiguro, 2006; Seyama and Nagayama, 
2007; Kanda et al, 2008). There is a wide range in what 
people may consider as an animate agent, as exemplified 
by well-known robots from cinema: from HAL's single 
camera eye, R2D2, Wall-E and Eva, which become surpris- 
ingly expressive and likeable with simple but effective de- 
signs, to more and more humanoid appearances such as 
the Terminator, Robocop and the replicants of Blade Runner. 

It may seem like a good idea to make artificial agents look 
as human-like as possible, especially if they will be used in 
social settings. However, we soon encounter the 'uncanny 
valley': as an agent's appearance is made more human-like, 
people's disposition toward it becomes more positive, until a 
point at which increasing human-likeness leads to the agent 
being considered strange, unfamiliar and disconcerting. This 
phenomenon was prominently described in robotics (Mori, 
1970), although there are early 20th century references to 
related concepts ['unheimlich', Freud, 1919; Jentsch, 1995 
(1906)]. More recently, the uncanny valley has increasingly 
been experienced by the public when characters in movies or 
video games appear to be 'not quite right'. For example, 
many viewers found characters in the animated film Polar 
Express to be off-putting (Levi, 2004). Most modern an- 
droids, including Repliee Q2 used here, are also thought to 
fall into the uncanny valley (Ishiguro, 2006). Although the 
uncanny valley remains an influential concept due to sub- 
stantial anecdotal evidence, and its importance for the design 
of artificial agents, there has been little systematic explor- 
ation of the phenomenon or its neural basis (Seyama and 
Nagayama, 2007; MacDorman et al, 2009a; Steckenfinger 
and Ghazanfar, 2009). 

Here, we hypothesized that the uncanny valley may, at
least partially, be caused by the violation of the brain's
predictions: when an agent looks like a human, based on a
lifetime of experience, the brain generates a prediction that
this appearance will be associated with a particular kind of
behavior (e.g. movement kinematics). When the behavior
of the agent violates the prediction, an error is generated
(see 'Discussion' section; Rao and Ballard, 1999; Friston,
2010); although to be clear, prediction error is not the
same thing as consciously experienced surprise (Friston,
2005; Kiebel et al., 2009).

1 The word android originates from a Greek root meaning 'man'. This is a gender-specific root, but in present
day English the usage is generally gender neutral. When possible, we promote the use of 'humanoid' to refer
to artificial agents modeled after humans, but this word does not allow us to distinguish between our
experimental conditions in the present article.

2 The scanner setup only allowed us to show a video of the robots to the subjects. Thus, while our setup did
not allow for full presence, we studied robots that have presence in their normal setting.

A related computational framework is provided by work 
on internal models of motor control (Wolpert et al, 1995). 
When we perform an action, we predict the sensory conse- 
quences of that action through generative or forward models 
(Wolpert et al, 1995; Wolpert and Miall, 1996). These pre- 
dictions can be used to correct for unanticipated events, and 
to account for sensory noise and delays. The models can be 
recruited to infer the meaning of a perceived action given the 
sensory information (Wolpert et al, 2003). During percep- 
tion, the error between the prediction coming from in- 
ternal models and incoming visual information can be 
minimized by selecting models yielding accurate predictions, 
that therefore correspond to the observed action (Kilner 
et al, 2007). 

To summarize, we performed functional magnetic reson- 
ance imaging (fMRI) as participants viewed short video clips 
of human or robotic agents carrying out recognizable 
actions. To our knowledge, the present study is the first 
neuroimaging investigation of action observation that has 
used robots with different levels of humanoid appearance. 
We used the android Repliee Q2, which has a very human- 
like appearance. With brief exposures, Repliee Q2 can be 
mistaken for a human being, but existing evidence indicates 
an uncanny valley experience with more prolonged exposure 
(Ishiguro, 2006). Importantly, we showed clips of Repliee Q2 
both with its full human-like appearance, and also with a 
mechanical appearance, after stripping the robot of its 
human-like form, but retaining exactly the same mechan- 
ical movements. We also showed clips of the real human 
that Repliee Q2 was designed to replicate in appearance 
(Figure 1). There were thus three Agent conditions:
Human, Android and Robot, which relate to our
experimental interests of appearance and motion as follows:
Human and Android conditions feature biological (i.e.
human-like) surface appearance, whereas the Robot condi- 
tion features a mechanical appearance. In terms of motion, 
the Android and Robot feature nonhuman motion, whereas 
biological motion is unique to the Human condition. In this 
scheme, the Robot and the Human are different from each 
other in both dimensions, while sharing a feature with the 
Android. But from another perspective, the Robot and 
the Human conditions are similar in that they both feature 
congruent appearance and motion (looks human, moves 
human; looks mechanical, moves mechanical) whereas the 
Android features mismatching or incongruent appearance 
and motion (looks human, moves mechanical). 
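The scheme above can be written down as a small data structure. This is only an illustrative sketch: the `Agent` class, the `congruent` property and the `shared_features` helper are hypothetical names introduced here, not part of the study's materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """One Agent condition, described by the two stimulus dimensions in the text."""
    name: str
    appearance: str  # "biological" or "mechanical"
    motion: str      # "biological" or "mechanical"

    @property
    def congruent(self) -> bool:
        # Appearance and motion match (looks human, moves human; or
        # looks mechanical, moves mechanical).
        return self.appearance == self.motion


AGENTS = [
    Agent("Human", "biological", "biological"),
    Agent("Android", "biological", "mechanical"),
    Agent("Robot", "mechanical", "mechanical"),
]


def shared_features(a: Agent, b: Agent) -> int:
    """Count how many of the two dimensions (appearance, motion) two agents share."""
    return int(a.appearance == b.appearance) + int(a.motion == b.motion)
```

Read this way, the Human and the Robot share no feature with each other, each shares exactly one feature with the Android, and only the Android is incongruent.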

One limitation for most fMRI studies on this topic to date 
is that they compared the overall level of BOLD signal across 
conditions. fMRI can be used to allow more refined infer- 
ences regarding the neural representations underlying the 
measured activity. A well-established approach involves 
repetition suppression (also called fMRI adaptation): this 
method has its origins in neurophysiology, and refers to 
the phenomenon of reduced neural response to a repeated
stimulus (Henson and Rugg, 2003; Grill-Spector et al, 
2006; Krekelberg et al, 2006). Repetition is thought to lead 
to such reduced responses only in neurons selective for the 
repeated properties, which allows the technique to be used as 
a means to explore what is represented in a particular brain 
region (e.g. motion direction sensitivity in area MT/V5 
(Bartels et al, 2008)). Repetition suppression effects are 
thought to reflect stimulus processing rather than task de- 
mands (Xu et al, 2007) and observed attentional modula- 
tions are not generic (Thompson and Duncan, 2009). 
Recently, the repetition suppression approach has begun to 
be applied to the study of action perception to identify func- 
tional properties of the APS (e.g. Hamilton and Grafton, 
2006, 2008; Dinstein et al, 2007; Chong et al, 2008; Fujii 
et al, 2008; Lestou et al, 2008; Kilner et al, 2009). 

The repetition suppression approach was ideally suited to
our goals. BOLD differences for the experimental conditions
(e.g. a main effect of Agent) can arise due to a number of
low-level stimulus factors such as differences in illumination,
spatial frequency, color, or contrast that have little or
nothing to do with action processing, as well as nonspecific
attention or arousal effects. Instead, we focused on the
interaction between repetition suppression and our
experimental conditions.

Fig. 1 Still images from the videos used in the experiment, depicting the agents. (A) Robot, (B) Android and (C) Human.

METHODS 
Participants 

Twenty healthy adults (aged 20-36 years) participated. Data 
from one participant could not be used due to excessive head 
movement. All participants had normal or corrected vision, 
no cognitive, attentional or neurological abnormalities by 
self-report, and were right-handed. All participants gave 
written informed consent in accordance with local ethics 
approval. 

Stimuli 

Stimuli were video clips of actions performed by Repliee Q2 
(in Android or Robot appearance, Figure 1A and B) and by 
the human 'master', after whom Repliee Q2 was modeled 
(Figure 1C). We refer to these agents as the Android, Robot 
and Human conditions (even though the former two are in 
fact the same robot). 

Repliee Q2 has 42 degrees of freedom and can make head 
and upper body movements. In its existing implementation, 
it is impossible for this machine to exactly match the dynamics
of human body movement (Pollick et al., 2005). The
actuators for Repliee Q2 were programmed over several 
weeks at Osaka University. The same movements were 
videotaped in two appearance conditions. For the Robot 
condition, we removed as many of the surface elements of 
Repliee Q2 as possible to reveal the materials underneath 
(e.g. wiring, metal arms and joints). The silicone 'skin' on 
the hands and face and some of the fine hair around the face 
could not be removed and was covered. In the Robot con- 
dition, Repliee Q2 could no longer be mistaken for a human 
(Figure 1A). 

Crucially, the kinematics of the movement for the 
Android and Robot conditions were identical, since these 
conditions in fact comprised the same robot, carrying out 
the very same, programmed movements. 

For the Human condition, we videotaped the female adult 
whose face was molded and used in constructing Repliee Q2. 
She was asked to watch each of the Repliee Q2's actions and 
then perform the same action naturally. 

All agents were videotaped in the same room and with the 
same background. A total of eight actions per actor were 
used in the fMRI experiment, including both transitive 
(drinking water from a cup, picking up a piece of paper from 
a table, grasping a tube of hand lotion, wiping a table with a 
cloth) and intransitive actions [waving hand, nodding 
affirmatively, shaking head (to convey no) and introducing 
self (Japanese bow)]. Video recordings were digitized, converted
to grayscale and cropped to 400 × 400 pixels.



A semi-transparent white fixation cross (40 pixels across) 
was superimposed at the center of the movies. 

Experimental procedures and data analysis 

MATLAB (Mathworks, Natick, MA, USA) and the Cogent 
toolbox (www.vislab.ucl.ac.uk/Cogent) were used for stimu- 
lus presentation and response collection. 

Each participant was given exactly the same introduction 
to the study and the same exposure to the videos prior to 
scanning, because prior knowledge can affect judgments of 
artificial agents differentially (Saygin and Cicekli, 2002). To 
minimize possible effects of familiarity or expertise on our 
results, we only recruited participants who had no experi- 
ence working with robots, had not spent time in Japan, nor 
had close friends or family from Japan (MacDorman et al, 
2009b). At the start of the study, subjects viewed each 
movie once outside of the scanner, and were told whether 
each agent was a human or a robot. They were not uncertain 
about the identity of the android by the time scanning took 
place. 

Each participant was scanned in six 445-s runs of the experiment,
each comprising 12 blocks. Each block contained
12 videos from Human, Android or Robot conditions, pre- 
sented in blocked counterbalanced order. Repetitions were 
event related. Videos were 2 s long and were separated by 
500 ms. Each clip was equiprobably a repeat of the previous 
clip or a nonrepeat. Repetition intervals were kept constant 
between the conditions. Repetition suppression was calculated
as the difference between the BOLD response to a new
(nonrepeated) stimulus and the response to the
same stimulus when it was repeated. Positive suppression
means there was less response to repeated stimuli.
Additional illustrations of the experiment are shown in
Supplementary Figure S1.
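The block structure and the repetition suppression measure described above can be sketched as follows. This is a minimal illustration rather than the presentation code used in the study (which ran in MATLAB with Cogent). The function names are hypothetical. Treating the first clip of a block as a nonrepeat is an assumption, as is requiring each nonrepeat to differ from the preceding clip.

```python
import random
from statistics import mean

CLIP_S = 2.0          # clip duration in seconds (from the text)
GAP_S = 0.5           # gap between clips in seconds (from the text)
CLIPS_PER_BLOCK = 12  # clips per block (from the text)


def make_block(actions, n_clips=CLIPS_PER_BLOCK, rng=None):
    """Return a list of (action, is_repeat) pairs for one block.

    Each clip is equiprobably a repeat of the previous clip or a nonrepeat.
    Assumption: the first clip of a block is always a nonrepeat.
    """
    rng = rng or random.Random()
    block, prev = [], None
    for _ in range(n_clips):
        if prev is not None and rng.random() < 0.5:
            block.append((prev, True))
        else:
            # Assumption: a nonrepeat must differ from the previous clip.
            prev = rng.choice([a for a in actions if a != prev])
            block.append((prev, False))
    return block


def block_duration(n_clips=CLIPS_PER_BLOCK):
    """Duration of one block in seconds, counting each clip plus its gap."""
    return n_clips * (CLIP_S + GAP_S)


def repetition_suppression(trials):
    """Nonrepeat minus Repeat response; positive means less response to repeats.

    trials: iterable of (is_repeat, response) pairs.
    """
    trials = list(trials)
    nonrepeat = [r for is_rep, r in trials if not is_rep]
    repeat = [r for is_rep, r in trials if is_rep]
    return mean(nonrepeat) - mean(repeat)
```

With these timings, a block of 12 clips lasts 30 s, which matches the 30-s spacing of the attention statements described in the next paragraph.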

To ensure sustained attention, every 30 s, participants were 
presented with a written statement about which they had to 
make a True/False judgment (e.g. 'I did not see her waving')
using an MRI compatible keypad. Participants had a 
maximum of 4 s to respond to each statement. We explored 
with a repeated measures analysis of variance (ANOVA) 
whether accuracy varied across conditions (it did not). 
Participants were instructed to keep their eyes on the fix- 
ation cross as much as possible, except at the end of the 
blocks when they read the statements. We used an 
MR-compatible eye tracker (see Supplementary Data) to
check whether eye movements differed between conditions 
(they did not). 

We used a 3T Siemens Allegra scanner and a standard 
gradient echo pulse sequence. fMRI data were analysed 
with SPM 5 (http://www.fil.ion.ucl.ac.uk/spm) using stand- 
ard procedures (see Supplementary Data for details). 
Although there is no agreed-upon localizer for the APS 
(Grafton and Hamilton, 2007), we selected regions of inter- 
est (ROIs), while also avoiding nonindependence errors 
(Kriegeskorte et al, 2009) using the main effect of Repetition. 
Table 1

                                             Peak (MNI)                Mean RS (% signal)
Anatomical description            BA          x    y    z      Z       Robot     Android   Human  Agent differences

Temporal cortex
  Lateral temporal cortex (EBA)   37, 22    -48  -72    6   7.47        0.51        1.20    1.08  Agent × repetition (P = 0.03); H > R (P = 0.07); A > R (P = 0.02)
                                             50  -64    0   6.94        0.85        1.07    0.95  None
  Fusiform gyrus                             46  -44  -16   4.02        0.22        0.87    0.41  Agent × repetition (P = 0.075); A > R (P = [illegible]); A > H (P = 0.05)
                                            -44  -44  -18   3.54        0.21        0.69    0.42  A > R (P = 0.06)

Occipital cortex
  V1/V2                           17, 18     16  -88    2   5.86       -0.79       -0.88   -0.73  None
                                            -12  -94    2   5.73       -0.81       -0.87   -0.85  None

Parietal cortex
  sIPS                            7          18  -72   60   4.93        0.29        0.88    0.51  A > R (P = 0.03)
                                            -20  -70   60   4.57        0.23        1.06    0.55  Agent × repetition (P = 0.04); A > R (P = 0.01); A > H (P = [illegible])
  aIPS                            40        -42  -38   42   3.96        0.30        0.81    0.42  Agent × repetition (P = 0.002); A > R (P = 0.002); A > H (P = 0.004)
                                             42  -36   42   4.37        0.22        0.93    0.39  Agent × repetition (P = 0.002); A > R (P = 0.003); A > H (P = 0.02)
  Cuneus (pIPS)                   19         24  -84   44   3.51        0.41        0.38    0.48  None
                                            -28  -82   40   3.63        0.41        0.41    0.39  None

Frontal cortex
  Middle frontal gyrus            10        -44   52   12   4.12        0.66        0.57    0.27  None
                                  10         46   50  -12   4.10        0.46        0.41    0.55  None
                                  46         50   48   10   3.93 [illegible] [illegible]    0.22  A > H (P = 0.07)
                                  6          44    8   54   3.91        0.27        0.58    0.64  None

Other
  Parahippocampal/Amygdala                   26   -4  -18   3.99        0.10        0.68    0.75  A > R (P = 0.09); H > R (P = 0.08)
                                            -28   -6  -20   3.69        0.20        0.58    0.66  None
  Temporoparietal junction (TPJ)  40         60  -40   26   3.83        0.29        0.54    0.57  None
                                  40, 13    -48  -38   28   3.43        0.54        0.44    0.27  None
  Cerebellum                                  8  -44  -18   3.69        0.48        0.49    0.46  None
  Paracentral                     5           4  -36   70   3.64        0.19        0.52    0.50  None
  Postcentral gyrus               3          70   -8   24   3.45        0.41        0.45    0.35  None

Anatomical description, Brodmann Areas (BA) and peak MNI coordinates are reported for each region in which the main effect of RS was significant (P < 0.05, FDR
corrected, minimum cluster size of 30 voxels). Mean repetition suppression (percentage of signal change for Nonrepeat - Repeat, see 'Methods' section) for the three agents at
these peaks is reported, along with any significant statistical differences (as measured using repeated measures ANOVA). We also noted pair-wise agent differences that were
significant (P < 0.05, corrected, two-tailed) and, in italics, those that fell short of significance but showed a tendency (P < 0.1, corrected, two-tailed). Significant
Agent by Repetition interactions are marked in bold, and are also plotted in Figure 3.



Focusing on brain areas that are sensitive to action repetition,
we explored contrasts of interest (Agent by Repetition
interaction). We identified regions showing repetition
suppression (Nonrepeat > Repeat) at F > 8.86 (P < 0.05, false
discovery rate (FDR) corrected), with a cluster size of at
least 30 voxels (Table 1). We then extracted percent signal change
within a sphere of 5 mm radius around these peaks for each
condition from each subject's first-level analysis, and tested
the Agent by Repetition interaction with an ANOVA. In a
balanced factorial design with equiprobable conditions
(as was used here), this process does not bias the chances
of finding an interaction (Kriegeskorte et al., 2009).
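The ROI step can be illustrated with a short sketch. It is not the SPM5 pipeline used in the study. The helper names are hypothetical, and the voxel size is an assumed parameter because the text does not state it. The sketch relies on one general fact: when the Repetition factor has only two levels, the within-subject Agent by Repetition interaction is equivalent to a one-way repeated-measures ANOVA across agents on the per-subject difference scores (Nonrepeat - Repeat).

```python
import math
from statistics import mean


def sphere_offsets(radius_mm=5.0, voxel_mm=3.0):
    """Voxel offsets (i, j, k) whose centers lie within radius_mm of a peak.

    voxel_mm is an assumption (isotropic voxels); the source does not give it.
    """
    reach = math.floor(radius_mm / voxel_mm)
    span = range(-reach, reach + 1)
    return [
        (i, j, k)
        for i in span for j in span for k in span
        if (i * i + j * j + k * k) * voxel_mm ** 2 <= radius_mm ** 2
    ]


def rs_scores(psc):
    """Convert per-subject (nonrepeat, repeat) pairs into RS difference scores.

    psc[s][a] is a (nonrepeat, repeat) percent-signal-change pair for
    subject s and agent a.
    """
    return [[non - rep for non, rep in subject] for subject in psc]


def rm_anova_f(scores):
    """One-way repeated-measures ANOVA on scores[subject][condition].

    Returns (F, df_conditions, df_error). Applied to RS difference scores,
    this F corresponds to the Agent by Repetition interaction.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    cond_means = [mean(row[j] for row in scores) for j in range(k)]
    subj_means = [mean(row) for row in scores]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_err = sum(
        (scores[i][j] - cond_means[j] - subj_means[i] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    df_cond, df_err = k - 1, (k - 1) * (n - 1)
    return (ss_cond / df_cond) / (ss_err / df_err), df_cond, df_err
```

For example, with three hypothetical subjects whose RS scores for (Robot, Android, Human) are (0.2, 0.9, 0.4), (0.3, 0.8, 0.4) and (0.4, 1.0, 0.1), `rm_anova_f` gives F(2, 4) = 14.4. These numbers are made up for illustration and are not data from the study.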



In reporting the effects, P-values were calculated two- 
tailed, and were corrected for multiple comparisons. 

RESULTS 

Behavioral and eye movement data 

Mean accuracy for the comprehension questions was 0.84
(s.d. = 0.28). Accuracy did not differ between conditions
(P > 0.1 for all pair-wise comparisons). None of the eye
movement measures (mean and s.d. of x and y position,
pupil size) differed between conditions (P > 0.1 for all pair-wise
comparisons). These data indicate that comparisons across
conditions were not subject to gross attention or eye
movement confounds.



fMRI data

There were notable differences in repetition suppression be- 
tween the agents, with the Human and Robot conditions 
leading to similar patterns of suppression, but the Android 
condition being distinctive, and leading to repetition sup- 
pression in a wider network (Figure 2). All agent conditions 
revealed repetition suppression in lateral temporal cortex. 
For the Android condition, repetition suppression was also 
evident in additional regions, notably in parietal and frontal 
cortex. 

To confirm and quantify these results, we performed ROI 
analyses. Broadly consistent with previous repetition sup- 
pression studies of action perception, the main effect of 
Repetition revealed a network of areas, including occipital, 
lateral and ventral temporal, parietal, frontal, parahippocam- 
pal and cerebellar regions (Figure 3 and Table 1). All showed 
reduced responses to the repeated stimuli, with the exception 
of primary visual cortex, which showed repetition enhance- 
ment. Repetition suppression was found in the parietal and 
temporal nodes of the APS, but despite being a key node of 
the APS, ventral premotor cortex did not show significant
repetition suppression (cf. Chong et al., 2008; Lestou et al.,
2008; Grossman et al., 2010). There were other repetition
suppression foci in frontal cortex, including one that 
extended into dorsal premotor cortex. 

Since our main interest was differential responses to the 
three agents (and the stimulus dimensions they represent, 
i.e. biological appearance and motion), we tested whether 
the repeated measures ANOVA revealed a significant Agent 
by Repetition interaction. Even though there was qualitative- 
ly more suppression for the Android condition compared 
with the others in a widespread network (Figure 2), the 
Agent by Repetition interaction reached significance in 
only a subset of these regions (Table 1). 

Figure 3 depicts repetition suppression as percent signal 
change in the peaks where the interaction was significant. 
In three parietal peaks, suppression was stronger for the
Android condition than for the Human and Robot conditions:
anterior intraparietal sulcus bilaterally (aIPS,
Figure 3A and C), and a more posterior and superior parietal
region (sIPS, Figure 3B) in the left hemisphere.

The Agent by Repetition interaction was also significant in
left lateral temporal cortex, where we observed greater
repetition suppression for the Human and Android conditions
than for the Robot condition (Figure 3D). There was a large
swathe of suppression covering multiple functional visual
areas, but the interaction was present only in one subpeak,
the coordinates of which corresponded to the previously
reported location of the extrastriate body area (EBA)
(Peelen and Downing, 2007), a region that responds more
strongly to images of bodies and body parts compared with
other kinds of stimuli. The right hemisphere peak, and a
more dorsal subpeak of the left hemisphere cluster, did not
show significant differences between agents.

Fig. 2 Repetition suppression. Whole-brain repetition suppression effect for (A) Robot, (B) Android and (C) Human conditions rendered on the lateral views of the cortical surface
of each hemisphere.

For completeness, we also report the main effect of Agent: 
this effect was found in visual cortex bilaterally (with peaks 
in MNI coordinates -30, -92, 2 and 38, -80, -16), and was 
driven by a stronger response for the Robot condition com- 
pared with the other agents. These differences almost cer- 
tainly reflect low-level visual differences between the stimuli 
(e.g. higher contrast, spatial frequency), demonstrating the 
advantage of using a repetition paradigm. 

DISCUSSION 

Summary of study and findings 

We conducted this study as part of our general goal of iden- 
tifying the functional properties of brain systems that allow 
us to understand others' body movements and actions 
(Saygin et al., 2004b; Saygin, 2007). Subjects viewed actions
performed by three agents that represented our experimental 
factors of interest: Human (biological motion and appear- 
ance), Android (biological appearance, nonbiological 
motion) and Robot (same agent as the Android, but 
'skinned' to reveal the internal mechanics, nonbiological ap- 
pearance and motion). 

There was little evidence for specificity for biological 
motion or appearance per se in our data. Even though 
the nervous system processes form and motion in partially 



segregated systems, these attributes are inextricably intercon- 
nected (Shepard, 2001) and for action perception, the inte- 
gration of motion and form cues may be a natural and 
critical aspect of the underlying computations. 

There was a significant Agent by Repetition interaction in 
the anterior portion of the intraparietal sulcus bilaterally, 
corresponding to area aIPS, the putative human homologue
of macaque area AIP (Grefkes and Fink, 2005; Culham and 
Valyear, 2006; Grafton and Hamilton, 2007). Here, suppres- 
sion effects were larger for the Android compared with both 
the Human and the Robot conditions (Figure 3). 

We found one region in left posterior lateral temporal
cortex, where suppression for the Robot condition was
significantly less than that for the Human and the Android, the
two agents with human-like surface appearance. The peak
location in this cluster corresponded to the EBA (Peelen and
Downing, 2007), consistent with the role of form-based
processing in action perception (e.g. Lange and Lappe, 2006).

Predictive coding 

We suggest that our results, especially the distinctive effects 
for the Android condition, can be reconciled with the 'pre- 
dictive coding' framework of neural processing (e.g. Rao and 
Ballard, 1999; Friston, 2005, 2010; Kilner et al., 2007; Jakobs
et al, 2009), which is based on minimization of prediction 
error among the levels of a cortical hierarchy. The key idea in 
this context is that brain activity will be higher for a stimulus 
that is not well-predicted or explained by a generative neural 
model of the external causes for sensory states (Friston, 
2010). Predictive coding fits well with the view of perception 
as an active process that involves generating predictions 
about the environment, as well as the brain's own states 
(e.g. Yuille and Kersten, 2006; Bar, 2009; Barsalou, 2009). 




Fig. 3 Interactions. The top panel shows the main effect of Repetition (irrespective of Agent) rendered on the lateral views of the cortical hemispheres. The graphs depict the
repetition suppression effect in all the peaks in which there was a significant interaction of Repetition by Agent (see Table 1 for statistics): (A) left aIPS, (B) left sIPS, (C) right aIPS, (D) left EBA. Y-axes are percent signal change
(Nonrepeat - Repeat).

We have a lifetime of experience that associates human
appearance with biological motion, and machines (such as 
robots) with mechanical motion. For both our Human con- 
dition and our Robot condition, the observed motion kine- 
matics was congruent with what would be predicted from 
the appearance of the agent. For the Android, however, there 
was a mismatch between the human-like appearance and the 
mechanical motion, leading to a larger prediction error, 
manifest as activity in relevant brain regions. A closer look 
at the data showed that responses to the nonrepeated videos 
were significantly greater for the Android compared with the 
other agents (Supplementary Figure S2), further supporting 
this interpretation. The prediction error would be smaller 
when a stimulus was preceded by the same stimulus, con- 
sistent with neural models of repetition suppression 
(Desimone and Duncan, 1995; Friston, 2005; Grill-Spector 
et al., 2006).

The differences between agent types for repetition effects 
were most pronounced in parietal cortex. The aIPS, being
the anatomical link between the posterior, visual compo- 
nents of the APS and the anterior, motor components 
(Petrides and Pandya, 1988; Seltzer and Pandya, 1994; 
Matelli and Luppino, 2001; Rozzi et al., 2006), is ideally
located to generate sensory predictions in this network. To 
describe the flow of information in the system, more time- 
resolved measurement techniques should be used, such as 
electroencephalography (EEG) or magnetoencephalography 
(MEG), as we are doing in related work. 

Predictive coding not only provides a satisfactory inter- 
pretation of the current data, but also couches them in a 
framework that has both established and growing support 
in neuroscience (Rao and Ballard, 1999; Friston, 2005; Bar, 
2009). We speculate that the present results reflect relatively 
general principles of neural organization, but also that the 
prediction errors may be dependent on how narrowly tuned 
the nervous system is for a particular domain. Future 
work should explore whether the perception of our conspe- 
cifics is an especially narrowly tuned domain, based on its 
evolutionary importance, and/or our extensive experience 
of interacting with conspecifics. 

Contribution to the understanding of the uncanny valley

The uncanny valley has many potential dimensions 
(MacDorman and Ishiguro, 2006; Ho and MacDorman, 
2010; Pollick, 2009). Our experiments and similar studies 
(e.g. Steckenfinger and Ghazanfar, 2009) were not designed 
in an optimum fashion to 'explain' the uncanny valley and as 
such can only make a modest contribution to defining its 
neural basis. However, the present results suggest an intri- 
guing link between brain responses in the APS and the un- 
canny valley. While the android used in our study is often 
mistaken for a human at first sight, longer exposure and 
dynamic viewing have been linked to the uncanny valley
(Ishiguro, 2006). In a predictive coding account of action
perception, the android is not predictable: an agent with
that appearance (human) would typically not move mech- 
anically. When the nervous system is presented with 'the 
thing that should not be' [Lovecraft, 1984 (1936); Hetfield 
et al., 1986], a propagation of prediction error may occur in
the APS. While we cannot state a conclusive or causal link 
between prediction error and the uncanny valley based on 
the present data, we suggest this framework may contribute 
to an explanation for the uncanny valley. 

Toward an interdisciplinary science of social perception

Humanoid robots and artificial agents are increasingly part 
of our daily lives (Kanda et al., 2004; Dautenhahn, 2007;
Tapus et al., 2007). With applications in domains such as
healthcare, education, communications, entertainment and 
the arts, exploring human factors in the design and develop- 
ment of artificial agents is ever more important. This will 
require an interdisciplinary approach, to which we have con- 
tributed new data from cognitive neuroscience. 

The present study is only a beginning. Computational 
modeling, ideally in conjunction with neuroimaging, will 
be important to specify or constrain the mechanisms
underlying action perception, and to link this work with
established frameworks of sensorimotor control (Wolpert et al.,
1995, 2003; Kawato, 1999; Kilner et al., 2007). Predictive
coding can be used to specify new hypotheses to explore 
further the interplay between appearance and motion of arti- 
ficial agents, and to extend the approach to sensory integra- 
tion more broadly. For example, it is possible that we have 
some prior idea of how robots should move (perhaps as
evidenced by professionals making money by painting themselves
gold, standing in front of cathedrals and moving like
robots), and similar patterns of prediction errors for viewed
actions might be generated for humans moving like robots
(cf. Shimada, 2010) or, more generally, for other kinds of
expectation violations between appearance and motion. 
Alternatively, the effects observed here could be specific to 
the perception of animate or biologically relevant entities.
Computer animation will be used to manipulate appearance 
and movement more parametrically and address these and 
similar questions in future work. 

Despite many unknowns, our results already suggest an 
interpretation for the classic anecdotal reports of the un- 
canny valley effect. Psychologists have long pointed out 
those aspects of our physical experience that shape our per- 
ceptual systems (Gibson, 1979; Barlow, 2001). It has also 
long been acknowledged that violating perceptual expect- 
ations can have striking effects, compellingly illustrated by 
perceptual illusions (e.g. Gregory, 1980). As human-like arti- 
ficial agents become more commonplace, perhaps our per- 
ceptual systems will be retuned to accommodate these new 
social partners. Or perhaps, we will decide it is not a good 
idea to make them so closely in our image after all.